Hard-Clustering with Gaussian Mixture Models
Authors
Abstract
Training the parameters of statistical models to describe a given data set is a central task in data mining and machine learning. A very popular and powerful approach to parameter estimation is maximum likelihood estimation (MLE). Among the most widely used families of statistical models are mixture models, especially mixtures of Gaussian distributions. A popular hard-clustering variant of the MLE problem is the so-called complete-data maximum likelihood estimation (CMLE) method. The standard approach to solving the CMLE problem is the Classification-Expectation-Maximization (CEM) algorithm [CG92]. Unfortunately, the algorithm is only guaranteed to converge to some (possibly arbitrarily poor) stationary point of the objective function. In this paper, we present two algorithms for a restricted version of the CMLE problem. That is, our algorithms approximate reasonable solutions to the CMLE problem which satisfy certain natural properties. Moreover, they compute solutions whose cost (i.e., complete-data log-likelihood value) is at most a factor of (1 + ε) worse than the cost of the solutions that we search for. Note that the CMLE problem in its most general, i.e., unrestricted, form is not well defined and admits trivial optimal solutions that can be regarded as degenerate.
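To make the CEM iteration mentioned in the abstract more concrete, here is a minimal Python sketch of a generic hard-assignment loop for Gaussian mixtures. It is not either of the two algorithms the paper presents. The function names (`cem`, `component_scores`, `log_gaussian`), the `init_means` parameter, and the start from uniform weights and identity covariances are all assumptions made for illustration. The guard that rejects clusters with too few points is also an assumption of this sketch. It only hints at why the unrestricted problem admits degenerate solutions: a cluster collapsing onto very few points has a singular covariance and an unbounded likelihood.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Log-density of N(mean, cov) at each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def component_scores(X, weights, means, covs):
    """Column j holds log w_j + log N(x_i | mu_j, Sigma_j)."""
    return np.column_stack([
        np.log(w) + log_gaussian(X, m, c)
        for w, m, c in zip(weights, means, covs)
    ])


def cem(X, init_means, max_iter=100):
    """Hard-assignment CEM sketch; returns (labels, objective history)."""
    n, d = X.shape
    k = len(init_means)
    # Assumed start: uniform weights and identity covariances.
    weights = np.full(k, 1.0 / k)
    means = np.asarray(init_means, dtype=float)
    covs = np.stack([np.eye(d)] * k)
    labels, history = None, []
    for _ in range(max_iter):
        # C-step: assign each point to its highest-scoring component.
        new_labels = component_scores(X, weights, means, covs).argmax(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable: a stationary point
        labels = new_labels
        # M-step: per-cluster maximum likelihood estimates.
        sizes = np.bincount(labels, minlength=k)
        if sizes.min() <= d:
            # Assumption of this sketch: refuse near-empty clusters, whose
            # covariance would be singular (a degenerate solution).
            raise ValueError("cluster with too few points")
        weights = sizes / n
        means = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
        covs = np.stack([
            np.atleast_2d(np.cov(X[labels == j], rowvar=False, bias=True))
            for j in range(k)
        ])
        # Complete-data log-likelihood of the current hard clustering.
        scores = component_scores(X, weights, means, covs)
        history.append(scores[np.arange(n), labels].sum())
    return labels, history


# Usage on synthetic data: two well-separated blobs in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(10.0, 1.0, size=(100, 2))])
labels, history = cem(X, init_means=[[0.0, 0.0], [10.0, 10.0]])
```

Because the C-step and the M-step each maximize the complete-data log-likelihood over one group of variables while the other is held fixed, the values in `history` never decrease. This is the sense in which such an iteration reaches a stationary point. The result still depends on the initialization, consistent with the abstract's remark that the stationary point may be arbitrarily poor.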
Similar Resources
Strong Coresets for Hard and Soft Bregman Clustering with Applications to Exponential Family Mixtures
Coresets are efficient representations of data sets such that models trained on the coreset are provably competitive with models trained on the original data set. As such, they have been successfully used to scale up clustering models such as K-Means and Gaussian mixture models to massive data sets. However, until now, the algorithms and the corresponding theory were usually specific to each clus...
IMAGE SEGMENTATION USING GAUSSIAN MIXTURE MODEL
Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fit a Gaussian mixture model to the pixels of an image. The parameters of the model are estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image is assigned by Bayes' rule. In fact, ...
Image Segmentation using Gaussian Mixture Model
Abstract: Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fit a Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by the EM algorithm. In addition, the label corresponding to each pixel of the true image was assigned by Bayes' rule. In fact,...
Clustering subgaussian mixtures by semidefinite programming
We introduce a model-free relax-and-round algorithm for k-means clustering based on a semidefinite relaxation due to Peng and Wei [PW07]. The algorithm interprets the SDP output as a denoised version of the original data and then rounds this output to a hard clustering. We provide a generic method for proving performance guarantees for this algorithm, and we analyze the algorithm in the context...
Nonparametric Bayesian Clustering via Infinite Warped Mixture Models
We introduce a flexible class of mixture models for clustering and density estimation. Our model allows clustering of non-linearly-separable data, produces a potentially low-dimensional latent representation, automatically infers the number of clusters, and produces a density estimate. Our approach makes use of two tools from Bayesian nonparametrics: a Dirichlet process mixture model to allow a...
Journal title:
- CoRR
Volume: abs/1603.06478, Issue:
Pages: -
Publication date: 2016